Abstract

In this paper, we provide explicit constructions for a class of exact-repair regenerating codes that possess a
layered structure. These regenerating codes correspond to interior points on the storage-repair-bandwidth tradeoff,
and compare very well against schemes that employ space-sharing between MSR and MBR codes. For
the parameter set $(n, k, d = k)$ with $n < 2k - 1$, we construct a class of codes with an auxiliary parameter $w$,
referred to as canonical codes. With $w$ in the range $n - k < w < k$, these codes operate in the region between the
MSR point and the MBR point, and perform significantly better than the space-sharing line. They only require a
field size greater than $w + n - k$. For the case of $(n, n - 1, n - 1)$, canonical codes can also be shown to achieve an
interior point on the line-segment joining the MSR point and the next point of slope-discontinuity on the
storage-repair-bandwidth tradeoff. Thus we establish the existence of exact-repair codes on a point other than the MSR and
the MBR point on the storage-repair-bandwidth tradeoff. We also construct layered regenerating codes for the general
parameter set $(n, k < d, k)$, which we refer to as non-canonical codes. These codes also perform significantly better
than the space-sharing line, though they require a significantly higher field size. All the codes constructed in this
paper are high-rate, can repair multiple node failures and do not require any computation at the helper nodes. We
also construct optimal codes with locality in which the local codes are layered regenerating codes.

I. Introduction 

A. Regenerating Codes 

In a distributed storage system, information pertaining to a single file is distributed across multiple nodes. In
the present context, a file is a collection of $K$ symbols drawn from a finite field $\mathbb{F}_q$ of size $q$. Thus a file can be
represented as a $(1 \times K)$ vector over $\mathbb{F}_q$. A data collector should be able to retrieve the entire file by downloading data
from any arbitrary set of $k$ nodes. Since the nodes are prone to failure, the system should be able to repair a failed
node by downloading data from the remaining active nodes. In the framework of regenerating codes introduced in
[1], a codeword is an $\mathbb{F}_q$-matrix of size $(\alpha \times n)$, where each column corresponds to the data stored by a single node.
A failed node is regenerated by downloading $\beta \le \alpha$ symbols from each node in any arbitrary set of $d$ nodes. These $d$ nodes are
referred to as helper nodes. Since the entire file can be recovered from any arbitrary set of $k$ nodes, we must have

$$k \le d \le n - 1.$$

The total bandwidth consumed for the repair of a single node equals $d\beta$ and is termed the repair bandwidth.
Thus a regenerating code is parameterized by the ordered set $(\mathbb{F}_q, (n, k, d), (\alpha, \beta), K)$.

In the framework of regenerating codes, it is not required that the replacement node contain exactly the same 
symbols as did the failed node. It is only required that following regeneration, the network possess the same 
properties with regard to data collection and node repair as it did prior to node failure. Thus one distinguishes 
between functional and exact repair in a regeneration code, [B, 121 , |l3J. The present paper is concerned only with 
exact repair. 

A regenerating code is said to be linear if the encoded $(\alpha \times n)$ matrix is a linear transformation of
the $(1 \times K)$ file vector. Linear codes offer the advantage that data recovery and node regeneration can be
accomplished through low-complexity, linear operations over the field $\mathbb{F}_q$. The regenerating codes constructed in
the present paper have the additional feature that no computations are needed at a helper node; a simple transfer of
the contents of the helper node suffices. We will term such regenerating codes help-by-transfer regenerating codes.
This is distinct from the class of repair-by-transfer regenerating codes discussed in [4], which have the additional
feature that no computations are needed even at the replacement node.
{biren,vijay}@ece.iisc.ernet.in). This research is supported in part by the National Science Foundation under Grant 0964507 and in part by
the NetApp Faculty Fellowship program. 
B. The Classical Storage-Repair-Bandwidth Tradeoff

A major result in the field of regenerating codes is the proof in [1] that uses the cut-set bound of network coding
to establish that the parameters of a regenerating code must necessarily satisfy the inequality

$$K \le \sum_{i=0}^{k-1} \min\{\alpha, (d - i)\beta\}. \tag{1}$$

Optimal regenerating codes are those for which equality holds in (1). It turns out that for a given value of $K$, $k$, $d$,
there are multiple pairs $(\alpha, \beta)$ for which equality holds in (1). It is desirable to minimize both $\alpha$ and $\beta$, since
minimizing $\alpha$ reduces storage requirements, while minimizing $\beta$ results in a storage solution that minimizes repair
bandwidth. It is not possible to minimize both $\alpha$ and $\beta$ simultaneously, and thus there is a tradeoff between choices
of the parameters $\alpha$ and $\beta$. The two extreme points in this tradeoff are termed the minimum storage regeneration
(MSR) and minimum bandwidth regeneration (MBR) points respectively. The parameters $\alpha$ and $\beta$ for the MSR
point on the tradeoff can be obtained by first minimizing $\alpha$ and then minimizing $\beta$ to obtain

$$K = k\alpha \tag{2}$$
$$\alpha = (d - k + 1)\beta \tag{3}$$

Reversing the order leads to the MBR point, which thus corresponds to

$$K = \left(dk - \binom{k}{2}\right)\beta \tag{4}$$
$$\alpha = d\beta. \tag{5}$$
Fig. 1. Storage-repair-bandwidth tradeoff. Here [n = 131, k = 120, d = 129, B = 725360]. (Axis: storage per node, $\alpha$.)

Fig. 2. Normalised storage-repair-bandwidth tradeoff. [n = 131, k = 120, d = 129] (Axis: normalised storage overhead, $\Omega$.)

The remaining points on the tradeoff will be referred to as interior points. As the tradeoff is a piecewise linear
relation, there are $k$ points of slope discontinuity, corresponding to

$$\alpha = (d - p)\beta, \qquad p \in \{0, 1, \ldots, k - 1\}.$$

Setting $p = (k - 1)$ and $p = 0$ yields the MSR and MBR points respectively. Thus the remaining values
of $p \in \{1, \ldots, k - 2\}$ correspond to interior points. The tradeoff between $\alpha$ and $d\beta$ is plotted in Fig. 1 for
(n = 131, k = 120, d = 130) and filesize K = 725360.
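As a small numerical illustration of the bound (1) and of the points of slope discontinuity, the following Python sketch evaluates the right-hand side of (1) at $\alpha = (d - p)\beta$ for each $p$. The function names and the toy parameters $(k = 3, d = 4, \beta = 1)$ are assumptions chosen only for illustration; they do not appear in the paper.

```python
from math import comb


def cutset_capacity(k, d, alpha, beta):
    """Right-hand side of the cut-set bound (1): sum of min(alpha, (d-i)*beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


def corner_points(k, d, beta=1):
    """List (p, alpha, largest admissible K) for alpha = (d - p) * beta, p = 0..k-1."""
    points = []
    for p in range(k):
        alpha = (d - p) * beta
        points.append((p, alpha, cutset_capacity(k, d, alpha, beta)))
    return points


# Toy parameters (illustrative assumption, not taken from the paper).
k, d = 3, 4
points = corner_points(k, d)
for p, alpha, K in points:
    print(f"p={p}: alpha={alpha}, K<={K}")

# p = 0 reproduces the MBR relations (4)-(5); p = k-1 reproduces the MSR relations (2)-(3).
assert points[0] == (0, d, d * k - comb(k, 2))
assert points[-1] == (k - 1, d - k + 1, k * (d - k + 1))
```

For these toy values the two end points agree with the closed forms for the MBR and MSR points, and the single intermediate value of $p$ is an interior point.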

In in, the authors proved that the interior points of the storage-repair-bandwidth-tradeoff cannot be achieved, 
under exact repair. This raises an open question as to how close one can come to the tradeoff at an interior point. 
C. Normalised Storage-Repair-Bandwidth Tradeoff

In this subsection, we draw upon [5] and [6] to introduce a normalized version of the classical
storage-repair-bandwidth tradeoff, which we motivate as follows. Consider a situation where a user desires to store a file of size $K$
across $n$ nodes for a time period $T$ with each node storing $\alpha$ symbols. We follow [5] and assume a Poisson-process
model of node failures under which the number of failures in time $T$ is proportional to the product of $T$ and the
number of nodes $n$. We also assume that there is a cost associated with both node storage and repair
bandwidth. The cost of storage is assumed to be proportional to the amount of data stored, i.e., to $n\alpha$. The cost
of a single node repair is taken as the amount of data downloaded to repair a node, i.e., $d\beta$. For simplicity, we only
consider the case of single-node repairs in the performance comparison, although a similar analysis can be carried out
for the case of multiple node failures. With this, it follows that if $\Gamma(K, T)$ denotes the cost incurred to store a file
of size $K$ for a time period $T$ using a particular coding scheme, then

$$\Gamma(K, T) = (\gamma_b\, n d \beta + \gamma_s\, n \alpha)\, T \tag{6}$$

for some proportionality constants $\gamma_b, \gamma_s$. Hence the average cost incurred in storing one symbol for one unit of
time is given by

$$\frac{\Gamma(K, T)}{KT} = \gamma_b \frac{n d \beta}{K} + \gamma_s \frac{n \alpha}{K}.$$

We will refer to the quantities $\Omega := \frac{n\alpha}{K}$ and $\Theta := \frac{nd\beta}{K}$ as the storage overhead and normalized repair bandwidth
of the code respectively. Thus the average cost is a linear combination of the normalized repair bandwidth $\Theta$ and
the storage overhead $\Omega$. The rate $R$ of a code is the inverse of $\Omega$, i.e., $R = \frac{K}{n\alpha}$.

When we set $d = n - \gamma$ and $\alpha = (d - p)\beta$, $p \in \{0, 1, \ldots, k - 1\}$, the sum in (1) evaluates to
$\left(k(d - p) - \tfrac{1}{2}(k - p)(k - p - 1)\right)\beta$, and the tradeoff in (1) translates to

$$\Omega \ge \left(\frac{k}{n} - \frac{(k - p)(k - p - 1)}{2n(n - \gamma - p)}\right)^{-1} = \Omega^*$$

$$\Theta \ge \frac{n - \gamma}{n - \gamma - p}\left(\frac{k}{n} - \frac{(k - p)(k - p - 1)}{2n(n - \gamma - p)}\right)^{-1} = \Theta^*$$

where $\Omega^*$ and $\Theta^*$ represent the minimum possible values of $\Omega$ and $\Theta$ respectively.
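The normalized quantities can be checked numerically. The sketch below computes the storage overhead $n\alpha/K$ and the normalized repair bandwidth $nd\beta/K$ directly from the cut-set bound (1) with $\alpha = (d - p)\beta$, and compares the storage overhead with the closed form obtained by summing the bound. The function names and the toy parameters are hypothetical and serve only as an illustration under these assumptions.

```python
from fractions import Fraction


def normalized_point(n, k, gamma, p):
    """Minimum (storage overhead, normalized repair bandwidth) with d = n - gamma
    and alpha = (d - p) * beta, computed directly from the cut-set bound (1)."""
    d = n - gamma
    # Largest file size per unit of beta allowed by the bound (1).
    K = sum(min(d - p, d - i) for i in range(k))
    omega = Fraction(n * (d - p), K)  # n * alpha / K
    theta = Fraction(n * d, K)        # n * d * beta / K
    return omega, theta


def omega_closed_form(n, k, gamma, p):
    """Closed form for the minimum storage overhead, obtained by summing (1)."""
    inverse = Fraction(k, n) - Fraction((k - p) * (k - p - 1), 2 * n * (n - gamma - p))
    return 1 / inverse


# Toy parameters (illustrative assumption): n = 5, k = 3, gamma = 1, so d = 4.
n, k, gamma = 5, 3, 1
for p in range(k):
    omega, theta = normalized_point(n, k, gamma, p)
    assert omega == omega_closed_form(n, k, gamma, p)
    assert theta == Fraction(n - gamma, n - gamma - p) * omega
    print(f"p={p}: storage overhead={omega}, normalized repair bandwidth={theta}")
```

Exact rational arithmetic is used so that the direct evaluation and the closed form can be compared with equality rather than with a floating-point tolerance.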

A plot of $\Theta^*$ as a function of $\Omega^*$, as $p/k$ is varied, is referred to as the normalised tradeoff. Unlike in
the classical tradeoff, points in the modified tradeoff correspond neither to a fixed file size $K$ nor to a fixed
parameter set $[n, k, d]$. The normalized tradeoff is parameterised by $k/n$ and $d/n$, where the tradeoff corresponds to the
varying parameter $p/k$, which takes rational values bounded within $0$ and $(k - 1)/k$. However, the plot shown in Fig. 2 is for
a fixed parameter set [n = 131, k = 120, d = 129].

As in O, an asymptotic analysis of normalised storage-repair-bandwidth tradeoff can be done as n scales to 
infinity. In this approach, the following quantities 

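$$\kappa = \lim_{n \to \infty} \frac{k}{n}, \qquad \theta = \lim_{n \to \infty} \frac{p}{k}, \qquad \lambda = \lim_{n \to \infty} \frac{d}{n}$$

are fixed as $n$ scales. If one assumes that $d = n - \gamma$, where $\gamma$ does not scale with $n$, we obtain $\lambda = 1$. Note that
$\kappa \in [0, 1]$ and $\theta \in [0, 1]$. Here, $\theta = 0$ corresponds to the MBR point, $\theta = 1$ corresponds to the MSR point, and
$\theta \in (0, 1)$ corresponds to the interior points of the storage-repair-bandwidth tradeoff. In this setting, we obtain an
asymptotic version of the normalised storage-repair-bandwidth tradeoff as given below.

$$\Omega^* = \left(\kappa - \frac{\kappa^2 (1 - \theta)^2}{2(1 - \theta\kappa)}\right)^{-1}, \qquad \Theta^* = \frac{1}{1 - \theta\kappa}\left(\kappa - \frac{\kappa^2 (1 - \theta)^2}{2(1 - \theta\kappa)}\right)^{-1}$$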

In O, the authors have used the asymptotic analysis to study the variation of the tradeoff with respect to k, for 
various points of operation $\theta$. These plots, drawn in Fig. 3, showcase the importance of regenerating codes for the
interior points of the tradeoff. From Fig. 3 it follows that for any fixed storage overhead, repair bandwidth can be
minimized by operating with the lowest value of $\theta$ that supports the given storage overhead. For example, if it is
sufficient to build a distributed storage system with storage overhead $\Omega \ge 2$, then it is better to operate with $\theta = 0$,
i.e., at the MBR point. Similarly, operating at $\theta = 1$, i.e., at the MSR point, is desirable only when the required storage overhead is
very close to 1. When the permissible storage overhead falls in the range $1 < \Omega < 2$, it is desirable to use codes
that operate in the range $0 < \theta < 1$, i.e., in the interior region of the tradeoff.




Fig. 3. Asymptotic normalised storage-repair-bandwidth tradeoff, as a function of $\kappa$, for various $\theta$. (Horizontal axis: storage overhead, $\Omega$.)



D. Existing Coding Schemes with Exact Repair 

Several coding schemes have been proposed in the literature in the exact-repair setting. In |[T4l . a framework to 
construct exact-repair optimal regenerating codes at the MBR and MSR points is provided. The framework permits 
the construction of MBR codes for all values for [n, k, d], and of MSR codes for d < 2A; — 3. In lITSl . high-rate MSR 
codes with parameters [n,k = n — 2,d = n — 1] are constructed using Hadamard designs. In fl^, high-rate MSR 
codes are constructed for d = n — 1; here efficient node-repair is guaranteed only in the case of systematic nodes. 
A construction for MSR codes with (i = n — 1>2A; — lis presented in ifTTl and ifTSl . The construction of MSR 
codes for arbitrary values of [n,k,d] remains an open problem, although it has been proven in [W] that exact-repair 
MSR codes exist for any parameter set [n,k,d] as the filesize grows to infinity. In |4J, a construction for a family 
of repair-by-transfer MBR codes is presented. The construction of regenerating codes for a functional-repair setting 
may be found in ll20i and ifTSll . The nonexistence of exact-repair codes that achieve the classical storage -repair 
bandwidth tradeoff is proven in H. 
E. Vector Codes

Regenerating codes can also be viewed as vector codes. An $[n, K, d_{\min}, \alpha]$ linear vector code $\mathcal{C}$ over a field $\mathbb{F}_q$ is
a subset of $(\mathbb{F}_q^{\alpha})^n$ for some $\alpha > 1$, such that given $c, c' \in \mathcal{C}$ and $a, b \in \mathbb{F}_q$, $ac + bc'$ also belongs to $\mathcal{C}$. A codeword
of a vector code is a matrix in $\mathbb{F}_q^{\alpha \times n}$, and a code symbol of a codeword is a vector in $\mathbb{F}_q^{\alpha}$. As a vector space over
$\mathbb{F}_q$, $\mathcal{C}$ has dimension $K$, termed the scalar dimension of the code. The Hamming distance between two codewords
is the number of code-symbol vectors at which they differ. In this sense, the code has minimum distance $d_{\min}$.

Associated with the vector code $\mathcal{C}$ is an $\mathbb{F}_q$-linear scalar code $\mathcal{C}^{(s)}$ of length $n\alpha$, where $\mathcal{C}^{(s)}$ is the collection
of $(1 \times n\alpha)$ vectors obtained by vectorising each codeword matrix in some prescribed order. Given a generator
matrix $G$ for the scalar code $\mathcal{C}^{(s)}$, the first code symbol in the vector code is naturally associated with the first $\alpha$
columns of $G$ and so on. We will refer to the collection of $\alpha$ columns of $G$ associated with the code symbol $c_i$
as the $i$-th thick column. We will refer to the columns of $G$ themselves as thin columns in order to avoid confusion,
and thus there are $\alpha$ thin columns per thick column of the generator matrix.

F. Locality 

Codes for distributed storage have been studied from other perspectives, different from the setting of regenerating 
codes. One prominent direction is related to codes with locality [7]. In this class of codes, a failed node is repaired
by downloading the entire data from a small set of nodes. Thus the property of locality allows one to minimise the number of
node accesses during repair. If the locality property holds only for systematic nodes, then it is referred to as information
locality, and if it holds for all nodes, it is referred to as all-symbol locality. Scalar codes (i.e., $\alpha = 1$) with locality
was introduced in Q, for the case of single symbol erasure, and subsequently extended in fSl, for the case of 
multiple erasures. An upperbound on the minimum distance of the scalar code with locality was derived in above 
papers. Scalar codes with information locality that are optimal with respect to the aforesaid bound were constructed 
in 191. Optimal scalar all-symbol local codes were constructed in HI and ifTOl . Another class of codes ifTTl . named as 
homomorphic self-repairing codes, constructed using linearized polynomials also turns out to be optimal scalar all- 
symbol codes. Recently, the concept of locality was studied for vector codes (i.e., a > 1) in fSl, ||T2| and fT3], and 
thereby making this class of codes to be a comparable alternative to regenerating codes. Codes combining benefits 
of regenerating codes and codes with locality were constructed in Q and lll2ll . In lfT2l . the authors consider codes 
with all-symbol locality where the local codes are regenerating codes. Bounds on minimum distance are provided 
and a construction for optimal codes with MSR all-symbol locality based on linearized polynomials (rank-distance 
codes) is presented. 

G. Gabidulin Codes 

Let $\mathcal{G} = \{g(x) = \sum_{i=0}^{D-1} g_i x^{q^i} \mid g_i \in \mathbb{F}_{q^N}\}$ denote the set of all linearized polynomials of $q$-degree $\le (D - 1)$ over
$\mathbb{F}_{q^N}$, and let $\{P_i\}_{i=1}^{K}$, $N \ge K \ge D$, be a collection of linearly independent elements over $\mathbb{F}_q$ in $\mathbb{F}_{q^N}$. Consider for
each $g \in \mathcal{G}$ the vector $(g(P_1), g(P_2), \ldots, g(P_K))$. By representing each element $g(P_i)$ as an $N$-element vector
over $\mathbb{F}_q$, we obtain an $(N \times K)$ matrix over $\mathbb{F}_q$. The resultant collection of matrices turns out to form a maximal
rank distance (MRD) code known as the Gabidulin code [21]. In the current paper, we will in several places deal
with vectors of the form $(g(P_1), \ldots, g(P_K))$, and it follows that these may also be regarded as codewords drawn
from the Gabidulin code.

H. Results 

In this paper, we first construct an $(n, k, d = k)$-regenerating code having a layered structure, which we term
the canonical code $\mathcal{C}_{\text{can}}$. This code has two auxiliary parameters $w$ and $\gamma$ satisfying $w \ge 2$, $\gamma \ge 1$, $w + \gamma \le n$ and
only requires field size $q > w + \gamma$. We show how, starting from a canonical code, it is possible to build a second
class of layered regenerating codes with $k < d$ by making suitable use of linearized polynomial evaluations (or
equivalently, codewords in the Gabidulin code) as is done in lilil . These codes will be referred to as non-canonical 
regenerating codes. The extension to the case $k < d$ requires, however, an expansion in field size from $q$ to $q^{K}$, where
$K$ is the scalar dimension of the underlying canonical code. These codes allow help-by-transfer repair ("uncoded"
repair) and are of high rate.

We also show that the canonical code with $\gamma < w < k$ always performs better than the space-sharing code. Recently,
Chao et al. proposed a construction of exact-repair codes [23] using Steiner systems that achieves points better
than the space-sharing line.¹ They consider constructions for $d = n - 1$. For the particular case of $d = n - 1$ and
$k = n - 2$, the performance of their code is identical to the construction in the present paper when our construction
is specialized to the same parameter set $d = n - 1$, $k = n - 2$.

Our constructions with $k = d = n - 1$ achieve an interior point of the storage-repair-bandwidth tradeoff that lies
midway between the MSR point and the next point of slope-discontinuity. Recently, Chao [24] has characterized
the optimal storage-repair-bandwidth tradeoff of (4, 3, 3)-exact repair codes. It turns out that the (4, 3, 3)-canonical
code appearing in this paper also achieves the same optimal region.

Finally, we construct codes with local regeneration following the techniques in [8], [12], in which the local codes
correspond to the canonical code.

I. Performance of Codes

The performance of this class of codes is compared against MBR and MSR codes using the normalized tradeoff.
The layered codes operate in the interior region between the MSR and MBR points, and the auxiliary parameter
$2 \le w \le k$ turns out to determine the specific interior point in the tradeoff. For a wide range of parameters
$(n, k, d)$, these codes outperform codes that space-share between MSR and MBR codes. Figures 4 and 5 show the
respective performance of canonical codes with (n = 61, k = 60, d = 60) and (n = 61, k = 58, d = 58). For the
case of (n = 61, k = 60, d = 60), an interior point on the tradeoff between the MSR point and the next point of
slope-discontinuity is achieved with w = 59. Achievability of an interior point by the canonical construction is depicted
in the classical storage-repair-bandwidth plot in Fig. 7 for the parameter set (n = 8, k = 7, d = 7) with auxiliary
parameter w = 6. The performance of non-canonical layered regenerating codes with (n = 61, k = 55, d = 60) is
shown in Fig. 6. As can be seen in the plots, the codes come close to the tradeoff in terms of performance.



II. Construction of the $(n, k, d = k)$-Canonical Layered Regenerating Code

In this section, we will describe the construction of a family of high-rate, $((n, k, d = k), (\alpha, \beta), K_c)$ regenerating
codes indexed by two auxiliary parameters $w, \gamma$ satisfying $w \ge 2$, $\gamma \ge 1$, $w + \gamma \le n$. The code has a layered
structure, and we will have $d = k$. The code will be simply referred to as the canonical code. The construction we
provide in this section assumes $(n, w + \gamma) = 1$, i.e., that $n$ and $w + \gamma$ are coprime. The general case of $(n, w + \gamma) > 1$ will be considered in the next
section.

A. Construction of the Canonical Code $\mathcal{C}_{\text{can}}$

The construction will make use of certain other parameters derived from $w$ and $\gamma$ as defined below.

$$L = \frac{1}{n}\binom{n}{w + \gamma} \quad \text{(number of patterns)}$$

$$V = \frac{\operatorname{lcm}(w, w + 1, \ldots, w + \gamma - 1)}{w} \quad \text{(repetition factor of each pattern)}$$

$$M = LV \quad \text{(number of layers)}$$

$$K_c = LVnw \quad \text{(scalar dimension of the canonical code)}.$$
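The derived parameters above are easy to tabulate. The following Python sketch computes $L$, $V$, $M$ and $K_c$ from $(n, w, \gamma)$ under the coprimality assumption of this section. The function name `canonical_parameters` and the example values are hypothetical, chosen only for illustration.

```python
from math import comb, gcd, lcm  # math.lcm requires Python 3.9+


def canonical_parameters(n, w, gamma):
    """Derived parameters of the canonical code when gcd(n, w + gamma) = 1."""
    if w < 2 or gamma < 1 or w + gamma > n:
        raise ValueError("need w >= 2, gamma >= 1 and w + gamma <= n")
    if gcd(n, w + gamma) != 1:
        raise ValueError("this sketch only covers the coprime case")
    L = comb(n, w + gamma) // n             # number of patterns
    V = lcm(*range(w, w + gamma)) // w      # repetition factor of each pattern
    M = L * V                               # number of layers
    K_c = L * V * n * w                     # scalar dimension
    return {"L": L, "V": V, "M": M, "K_c": K_c}


# Example values (illustrative assumption): n = 7, w = 3, gamma = 2.
params = canonical_parameters(7, 3, 2)
print(params)
```

The integer division by $n$ is exact in the coprime case, since $n$ divides $(w + \gamma)\binom{n}{w + \gamma}$ and is coprime to $w + \gamma$.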

The structure of the canonical code can be inferred from Fig. 8, which shows the four-step process by which the
incoming message vector $u$ is encoded:

(a) The $K_c$-tuple message vector $u$ is first partitioned into $LVn$ $w$-tuples $\{u_\tau^{(\ell,\nu)}\}$, indexed by $1 \le \ell \le L$, $1 \le \nu \le V$ and $0 \le \tau \le n - 1$.

(b) Each $w$-tuple is then encoded using a $[w + \gamma, w, \gamma + 1]$ MDS code to yield $LVn$ codewords $\{c_\tau^{(\ell,\nu)}\}$.

(c) The collection of $n$ codewords $\{c_\tau^{(\ell,\nu)}\}_{\tau=0}^{n-1}$ is then "threaded" to form a layer $A^{(\ell,\nu)}$ of the code matrix.
This threading is carried out with the help of a pattern $\pi^{(\ell)}$. The nature of a pattern and the threading process
are explained below in Sections II-B and II-C.

(d) The $LV$ layers are then stacked to form the code matrix

$$C = \begin{bmatrix} A^{(1,1)} \\ A^{(1,2)} \\ \vdots \\ A^{(L,V)} \end{bmatrix}.$$

Fig. 8. Encoder of the canonical layered regenerating code.

¹ Their paper appeared in the public literature only after the initial submission of our paper on arXiv.

² Exact-repair MSR codes are not known to exist for every value of the $(n, k, d)$-tuple. Hence the achievability of the space-sharing line joining
the MSR point and the MBR point is not always guaranteed.

Fig. 4. Plot comparing the performance of the canonical code with (n = 61, k = d = 60) for varying $w \in \{2, 5, 8, \ldots, 59\}$.
The MBR point (which is the degenerate case with w = 1) and the MSR point are also marked. With w = 59, an interior point
on the tradeoff between the MSR point and the next point of slope-discontinuity is achieved.

Fig. 5. Plot comparing performance of the canonical regenerating code with (n = 61, k = 58, d = 58) while varying
$w \in \{4, 7, 10, \ldots, 55\}$.

Fig. 6. Plot comparing performance of the non-canonical layered regenerating code with (n = 61, k = 55, d = 60)
while varying $w \in \{2, 3, \ldots, 9\}$.

Fig. 7. (8, 7, 7)-canonical code with w = 6 achieves the interior point.



B. Patterns 

There are $\binom{n}{w + \gamma}$ subsets of $[n]$ that are of size $(w + \gamma)$. Let us partition these subsets into equivalence classes by
declaring two subsets to be equivalent if one is a cyclic shift of the other. Given our assumption that $(n, w + \gamma) = 1$,
all equivalence classes will contain precisely $n$ elements and hence the number of equivalence classes is given by
$L = \frac{1}{n}\binom{n}{w + \gamma}$. Let

$$\{\pi^{(\ell)} \mid 1 \le \ell \le L\} = \{(\pi_1^{(\ell)}, \pi_2^{(\ell)}, \ldots, \pi_{w+\gamma}^{(\ell)}) \mid 1 \le \ell \le L\}$$

be the collection of subsets obtained by selecting one subset from each equivalence class. We will assume that the
elements within each of the subsets $\pi^{(\ell)}$ are ordered in ascending numerical order, i.e.,

$$\pi_1^{(\ell)} < \pi_2^{(\ell)} < \cdots < \pi_{w+\gamma}^{(\ell)}, \quad \text{for all } \ell.$$

We will associate with each such ordered subset a collection of $n$ two-dimensional patterns, each of size
$(w + \gamma) \times (w + \gamma)$. This collection includes the fundamental pattern

$$P^{(\ell)}(0) = \{(i, \pi_i^{(\ell)}) \mid 1 \le i \le w + \gamma\},$$

as well as its $n - 1$ (columnar) cyclic shifts

$$P^{(\ell)}(\tau) = \{(i, \pi_i^{(\ell)} \oplus \tau) \mid 1 \le i \le w + \gamma\}, \qquad 1 \le \tau \le (n - 1),$$

in which $\pi_i^{(\ell)} \oplus \tau$ is addition modulo $n$. Given a pattern $P^{(\ell)}(\tau)$, we will refer to the $(w + \gamma)$-tuple

$$\pi^{(\ell)} \oplus \tau = (\pi_1^{(\ell)} \oplus \tau, \pi_2^{(\ell)} \oplus \tau, \ldots, \pi_{w+\gamma}^{(\ell)} \oplus \tau)$$

as its (columnar) footprint. Thus the footprint of a fundamental pattern $P^{(\ell)}(0)$ is simply given by $\pi^{(\ell)}$.
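The selection of one representative subset per equivalence class can be sketched by brute-force enumeration. In the Python sketch below, the nodes are labelled $0, \ldots, n - 1$ so that the cyclic shift is ordinary addition modulo $n$; this labelling, the choice of the lexicographically first subset as representative, and the function names are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def fundamental_patterns(n, size):
    """Return one sorted representative from each cyclic-shift class of
    size-`size` subsets of {0, ..., n-1}."""
    seen = set()
    representatives = []
    for subset in combinations(range(n), size):
        if subset in seen:
            continue
        representatives.append(subset)
        # Mark every cyclic shift of this subset as already covered.
        for tau in range(n):
            seen.add(tuple(sorted((x + tau) % n for x in subset)))
    return representatives


def footprint(pi, tau, n):
    """Columnar footprint of the pattern obtained by shifting pi by tau."""
    return tuple((x + tau) % n for x in pi)


# Example (illustrative assumption): n = 5 and w + gamma = 3, which are coprime.
n, size = 5, 3
patterns = fundamental_patterns(n, size)
print(patterns)
assert len(patterns) == comb(n, size) // n  # L = (1/n) * C(n, w + gamma)
```

In the coprime case every class has exactly $n$ members, so the number of representatives returned matches $L$.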

C. Threading Codewords to Form a Layer 

We fix $(\ell, \nu)$ and describe the threading process as it applies to the $(\ell, \nu)$th layer. Consider the collection
of $n$ codewords $\{c_\tau^{(\ell,\nu)}\}_{\tau=0}^{n-1}$ associated to a layer. The symbols of the $\tau$th codeword $c_\tau^{(\ell,\nu)}$, $0 \le \tau \le n - 1$, are
placed (in any arbitrary order) into the $w + \gamma$ locations

$$P^{(\ell)}(\tau) = \{(i, \pi_i^{(\ell)} \oplus \tau) \mid 1 \le i \le w + \gamma\}, \qquad 0 \le \tau \le (n - 1),$$

identified by the pattern $P^{(\ell)}(\tau)$. We might also refer to a codeword of this erasure code as a thread. The threading
yields a $((w + \gamma) \times n)$ matrix which we will denote by $A^{(\ell,\nu)}$. The threading process is illustrated in Figure 9. We
then repeat this process for each layer, i.e., for all pairs $(\ell, \nu)$. Finally we vertically stack the matrices $A^{(\ell,\nu)}$ to
obtain the code matrix as described above. With this the encoding process is complete.

Fig. 9. Illustrating the threading process. The top left matrix uses an * to identify the elements of the two-dimensional pattern $P^{(\ell)}(0)$.
The top right matrix shows the entries of a codeword $c_0^{(\ell,\nu)}$ being inserted into the locations identified by the pattern. The bottom left shows
the codeword $c_1^{(\ell,\nu)}$ inserted into the locations identified by $P^{(\ell)}(1)$. The bottom right shows the completely filled-in layer $A^{(\ell,\nu)}$.
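The threading step can be sketched in a few lines of Python. The sketch assumes nodes labelled $0, \ldots, n - 1$, places the $i$-th symbol of each codeword in row $i$ (one of the arbitrary orders the text permits), and uses symbolic labels `(tau, i)` in place of actual MDS code symbols; the function name and the example pattern are hypothetical.

```python
def thread_layer(pi, codewords, n):
    """Place codeword tau into the positions {(i, (pi[i] + tau) mod n)} and
    return the resulting (w + gamma) x n layer as a list of rows."""
    rows = len(pi)
    layer = [[None] * n for _ in range(rows)]
    for tau, codeword in enumerate(codewords):
        for i, column in enumerate(pi):
            layer[i][(column + tau) % n] = codeword[i]
    return layer


# Example (illustrative assumption): n = 5, pattern pi = (0, 1, 3), so w + gamma = 3.
n, pi = 5, (0, 1, 3)
# Symbolic stand-ins for MDS codewords: symbol i of thread tau is labelled (tau, i).
codewords = [[(tau, i) for i in range(len(pi))] for tau in range(n)]
layer = thread_layer(pi, codewords, n)

# Every cell of the layer is filled exactly once.
assert all(cell is not None for row in layer for cell in row)
# Each node (column) holds symbols from w + gamma distinct threads.
for j in range(n):
    assert len({layer[i][j][0] for i in range(len(pi))}) == len(pi)

for row in layer:
    print(row)
```

Because the entries of the pattern are distinct, each column receives one symbol from each of $w + \gamma$ different threads, which is what spreads every MDS codeword across $w + \gamma$ distinct nodes.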



D. Parameters of the Canonical Code 

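1) Parameters $n$, $\alpha$: The parameter $n$ is simply the block length of the code $\mathcal{C}_{\text{can}}$, viewed as a vector code with
symbol alphabet $\mathbb{F}_q^{\alpha}$. The value of $\alpha$ can be computed from the nature of the construction and is given by

$$\alpha = LV(w + \gamma) = \frac{1}{n}\binom{n}{w + \gamma} V (w + \gamma) = \frac{\operatorname{lcm}(w, w + 1, \ldots, w + \gamma - 1)}{w}\binom{n - 1}{w + \gamma - 1}.$$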

2) Parameters $d$, $\beta$: We next note that the value of $d$ can be no less than $n - \gamma$, for otherwise it would not be
possible in some instances to repair a failed node. This follows from the fact that the symbols of each MDS code
are spread across $(w + \gamma)$ distinct nodes and that to repair a failed symbol in a $[w + \gamma, w, \gamma + 1]$ MDS code, one
needs access to at least $w$ symbols of the codeword. Conversely, it follows that if $d = n - \gamma$, then every failed node
can be repaired. We will set $d = n - \gamma$ here. It remains to establish that repair of a failed node can be accomplished
by connecting to $d$ nodes and downloading a fixed number $\beta$ of symbols from each of the $d$ helper nodes.

It will be convenient in our analysis to assume that along with the given failed node (say node $\eta_1$), there are $\gamma - 1$
other nodes (say, nodes $\eta_i$, $i = 2, 3, \ldots, \gamma$) that have also failed and that the remaining $d = n - \gamma$ nodes are acting
as the helper nodes. Let us assume further that node $h$ is one of the helper nodes. Our interest is in determining
the number of symbols that need to be transferred from node $h$ to node $\eta_1$ for the purposes of node repair. We
had noted earlier in describing the construction of the canonical code that each layer $A^{(\ell,\nu)}$ of the canonical code
is composed of $n$ MDS codewords $\{c_\tau^{(\ell,\nu)}\}_{\tau=0}^{n-1}$. The codeword $c_\tau^{(\ell,\nu)}$ is placed in the locations associated to the
pattern $P^{(\ell)}(\tau)$. We will refer to the $n$ MDS codewords as threads in the description below.

Node $h$ can transfer one symbol to the replacement for node $\eta_1$ iff there is a thread in some layer to which both
nodes $\eta_1$ and $h$ contribute code symbols. We now break up our count according to the total number $p$ of nodes
that have now failed, but which previously contributed a symbol to the erasure code thread. More specifically, we
are counting the number of threads such that

• both nodes $\eta_1$ and $h$ contribute a single code symbol to that thread

• $(p - 1)$ of the nodes $\{\eta_i \mid 2 \le i \le \gamma\}$ each contribute one code symbol to the thread, while the remaining failed
nodes do not contribute any code symbol to the thread.

The total number of such threads, across all the $L$ distinct layers in the code matrix, is given by

$$\binom{\gamma - 1}{p - 1}\binom{n - \gamma - 1}{w + \gamma - p - 1}.$$
Within the erasure code, the situation is that $p$ symbols have been erased and thus a total of $w + \gamma - p$ symbols
can serve as helpers for node $\eta_1$, of which the symbol held by node $h$ is one. Since any $w$ symbols suffice to help node $\eta_1$ recover
from the erasure, it suffices if node $h$ "on average" contributes a fraction

$$\frac{w}{w + \gamma - p}$$

of code symbols. We can ensure that this average is realized by calling upon the $V$ repetitions of each layer. The
number $V$ has been chosen such that for all $p$,

$$(w + \gamma - p) \mid Vw.$$

Thus we can ensure that the helper node will always pass on

$$V \cdot \frac{w}{w + \gamma - p}$$

code symbols when counted across all $V$ repetitions of the corresponding erasure code. It follows that the values
of $\beta$ and $d$ are given by

$$\beta = \sum_{p=1}^{\gamma} \binom{\gamma - 1}{p - 1}\binom{n - \gamma - 1}{w + \gamma - p - 1} \frac{Vw}{w + \gamma - p} \tag{13}$$

$$d = n - \gamma. \tag{14}$$

As a check, we note that each column contains $\alpha = LV(w + \gamma)$ symbols, each of which requires the transfer of
$w$ symbols to enable repair. Since there are a total of $(n - \gamma)$ helper nodes, we must have that

$$\beta(n - \gamma) = w\alpha,$$

i.e., $\beta$ must equal

$$\beta = \frac{w}{(n - \gamma)}\,\alpha = \frac{w}{(n - \gamma)} \cdot \frac{1}{n}\binom{n}{w + \gamma} V (w + \gamma). \tag{15}$$

It can be verified that the values for $\beta$ obtained in (13) and (15) are the same.
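The agreement between the two expressions for $\beta$ can be checked numerically. The Python sketch below evaluates $\beta$ once by counting threads over the number $p$ of failed contributors, and once from the balance $\beta(n - \gamma) = w\alpha$, in the coprime case. The function names and the sample parameter triples are assumptions made purely for illustration.

```python
from math import comb, gcd, lcm  # math.lcm requires Python 3.9+


def beta_by_counting(n, w, gamma):
    """Sum over p of (#threads with p failed contributors) * V*w/(w+gamma-p)."""
    V = lcm(*range(w, w + gamma)) // w
    total = 0
    for p in range(1, gamma + 1):
        threads = comb(gamma - 1, p - 1) * comb(n - gamma - 1, w + gamma - p - 1)
        total += threads * (V * w // (w + gamma - p))  # exact by choice of V
    return total


def beta_by_balance(n, w, gamma):
    """beta = w * alpha / (n - gamma) with alpha = L * V * (w + gamma)."""
    L = comb(n, w + gamma) // n
    V = lcm(*range(w, w + gamma)) // w
    alpha = L * V * (w + gamma)
    return w * alpha // (n - gamma)


# Sample (n, w, gamma) triples with gcd(n, w + gamma) = 1 (illustrative assumption).
for n, w, gamma in [(5, 2, 1), (7, 3, 2), (11, 4, 3)]:
    assert gcd(n, w + gamma) == 1
    assert beta_by_counting(n, w, gamma) == beta_by_balance(n, w, gamma)
    print((n, w, gamma), beta_by_counting(n, w, gamma))
```

This is only a spot check on a few parameter sets, not a proof of the identity.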

3) Determining $k$, $K$ and Code Rate $R$: Arguing as above, if $k < (n - \gamma)$, we will fail to decode at least one
thread. Hence $k \ge (n - \gamma)$. On the other hand, by connecting to $d = (n - \gamma)$ nodes we can recover the entire data and
hence $k = d$. The scalar dimension of the code is clearly given by $K_c = LVnw$. Not surprisingly, the rate $R$ of
the code is given by

$$R = \frac{K_c}{n\alpha} = \frac{w}{w + \gamma}.$$

III. $(n, k, d = k)$-Canonical Code When $(n, w + \gamma) \ne 1$

We consider the general case when $(w + \gamma, n) \ne 1$ and let the integer $g$ be defined by setting

$$\frac{n}{g} = (n, w + \gamma).$$

The differences in the case of $(w + \gamma, n) \ne 1$ arise out of how patterns are identified in the canonical code.
A. Patterns 

We partition, as before, the $\binom{n}{w + \gamma}$ subsets of $[n]$ of size $(w + \gamma)$ into equivalence classes by declaring two subsets
to be equivalent if one is a cyclic shift of the other. This time, however, different equivalence classes will be of
different sizes. The number of elements in an equivalence class will always be of the form $gr$ with $r$ dividing $\frac{n}{g}$.
Let $E(gr)$ denote the number of equivalence classes of size $gr$, and denote the total number of equivalence classes by $\mathcal{E}$.
The values of $E(gr)$ and of $\mathcal{E}$ are given by (proof in the appendix):

$$E(gr) = \frac{1}{gr}\sum_{s \mid r} \mu\!\left(\frac{r}{s}\right)\binom{gs}{gs(w + \gamma)/n}, \qquad \mathcal{E} = \sum_{r:\, gr \mid n} E(gr),$$

where $\mu(\cdot)$ denotes the Möbius function. Let

$$\{\pi^{(\ell)} \mid 1 \le \ell \le \mathcal{E}\} = \{(\pi_1^{(\ell)}, \pi_2^{(\ell)}, \ldots, \pi_{w+\gamma}^{(\ell)}) \mid 1 \le \ell \le \mathcal{E}\}$$

be the collection of subsets obtained by selecting one subset from each equivalence class. We will assume that the
elements within each of the subsets $\pi^{(\ell)}$ are ordered in ascending numerical order, i.e.,

$$\pi_1^{(\ell)} < \pi_2^{(\ell)} < \cdots < \pi_{w+\gamma}^{(\ell)}, \quad \text{for all } \ell.$$

We will associate with each such subset a collection of $n$ two-dimensional patterns, each having the same size
as in Section II-B. This collection includes the fundamental pattern

$$P^{(\ell)}(0) = \{(i, \pi_i^{(\ell)}) \mid 1 \le i \le w + \gamma\},$$

as well as its $n - 1$ (columnar) cyclic shifts

$$P^{(\ell)}(\tau) = \{(i, \pi_i^{(\ell)} \oplus \tau) \mid 1 \le i \le w + \gamma\}, \qquad 1 \le \tau \le (n - 1).$$

Given a pattern $P^{(\ell)}(\tau)$, we will refer to the $(w + \gamma)$-tuple

$$\pi^{(\ell)} \oplus \tau = (\pi_1^{(\ell)} \oplus \tau, \pi_2^{(\ell)} \oplus \tau, \ldots, \pi_{w+\gamma}^{(\ell)} \oplus \tau)$$

as its (columnar) footprint. Thus the footprint of a fundamental pattern $P^{(\ell)}(0)$ is simply given by $\pi^{(\ell)}$.
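The class counts in the non-coprime case can be cross-checked by brute force. The Python sketch below counts cyclic-shift classes of each size in two ways: by a Möbius-inversion count of subsets with a given period, which is one standard way of writing such a count and is assumed here rather than taken verbatim from the paper, and by direct enumeration. Nodes are labelled $0, \ldots, n - 1$, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb, gcd


def mobius(m):
    """Moebius function of a positive integer m, by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0  # squared prime factor
            result = -result
        p += 1
    return -result if m > 1 else result


def class_counts_formula(n, size):
    """Map class size g*r -> number of classes, via Moebius inversion."""
    g = n // gcd(n, size)
    counts = {}
    for r in range(1, n // g + 1):
        if (n // g) % r:
            continue
        # Subsets whose period is exactly g*r.
        exact = sum(mobius(r // s) * comb(g * s, g * s * size // n)
                    for s in range(1, r + 1) if r % s == 0)
        if exact:
            counts[g * r] = exact // (g * r)
    return counts


def class_counts_bruteforce(n, size):
    """Same map, obtained by enumerating orbits under cyclic shift."""
    seen, counts = set(), Counter()
    for subset in combinations(range(n), size):
        if subset in seen:
            continue
        orbit = {tuple(sorted((x + t) % n for x in subset)) for t in range(n)}
        seen |= orbit
        counts[len(orbit)] += 1
    return dict(counts)


for n, size in [(6, 3), (4, 2), (12, 4)]:
    assert class_counts_formula(n, size) == class_counts_bruteforce(n, size)
    print((n, size), class_counts_formula(n, size))
```

The enumeration is exponential in $n$ and is meant only as a sanity check on small examples.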



B. Layers of the Canonical Code 
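Let us define

$$L = \sum_{r:\, gr \mid n} r E(gr)$$

$$V = \frac{\operatorname{lcm}(w, w + 1, \ldots, w + \gamma - 1)}{w}$$

$$M = LV.$$

Let us define the function $\{\omega_\ell \mid 1 \le \ell \le \mathcal{E}\}$ by

$$\omega_\ell = r \quad \text{if the pattern } \pi^{(\ell)} \text{ has period } gr.$$

It follows that there are $E(gr)$ patterns corresponding to the value $\omega_\ell = r$. Thus we can alternately express $L$ in
the form

$$L = \sum_{r:\, gr \mid n} r E(gr) = \sum_{\ell=1}^{\mathcal{E}} \omega_\ell.$$

Each code matrix $C$ is composed of $LV$ vertically stacked layers, each layer corresponding to a matrix $\{A^{(\ell,\omega,\nu)} \mid
1 \le \ell \le \mathcal{E},\ 1 \le \omega \le \omega_\ell,\ 1 \le \nu \le V\}$ of size $((w + \gamma) \times n)$. Thus $C$ is of the form:

$$C = \begin{bmatrix} A^{(1,1,1)} \\ \vdots \\ A^{(\mathcal{E}, \omega_{\mathcal{E}}, V)} \end{bmatrix}.$$

The entries of the $\{A^{(\ell,\omega,\nu)}\}$ are specified below.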
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C. Threading Codewords to Form a Layer 

The threading process is identical to the case of $(n, w + \gamma) = 1$. We fix $(\ell, \nu)$ and the threading process in the $(\ell, \nu)$th layer is as follows. Consider the collection of n codewords $\{c_\tau^{(\ell,\nu)}\}_{\tau=0}^{n-1}$ associated to a layer. The symbols of the $\tau$th codeword $c_\tau^{(\ell,\nu)}$, $0 \le \tau \le n - 1$, are placed (in any arbitrary order) into the $(w + \gamma)$ locations

$$P^{(\ell)}(\tau) = \{(i, \pi_i^{(\ell)} \oplus \tau) \mid 1 \le i \le w + \gamma\}, \quad 1 \le \tau \le (n-1),$$

identified by the pattern $P^{(\ell)}(\tau)$. The threading yields a $((w + \gamma) \times n)$ matrix which we will denote by $A^{(\ell,\nu)}$. We then repeat this process for each layer, i.e., for all pairs $(\ell, \nu)$. It follows that a given pattern $P^{(\ell)}(\tau)$ determines the ordering of code symbols in $\omega_\ell V$ layers. In loose terms, each pattern is repeated $\omega_\ell V$ times and the parameters $(\omega_\ell, V)$ may hence be viewed as repetition parameters. The repetition factor $\omega_\ell$, which is pattern dependent, will help, as we shall see, ensure a larger code rate, whereas the constant repetition factor V ensures a uniform download during node repair as in Section II.

Finally we vertically stack the matrices $A^{(\ell,\nu)}$ to obtain the code matrix. This completes the specification of the code matrix C.
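The threading step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `thread_layer`, the 0-based indexing of rows and columns, and the example footprint (0, 2, 4) with n = 6 are assumptions made here. Each cell stores only the index τ of the codeword whose symbol lands there.

```python
def thread_layer(n, footprint):
    """Fill one (w + gamma) x n layer with codeword indices.

    footprint[i] is the column used by row i in the fundamental pattern
    (0-based, an assumption of this sketch). The tau-th codeword occupies
    the cells (i, footprint[i] + tau mod n).
    """
    rows = len(footprint)
    layer = [[None] * n for _ in range(rows)]
    for tau in range(n):
        for i, col in enumerate(footprint):
            cell = (col + tau) % n
            # Each cell must be written exactly once.
            assert layer[i][cell] is None
            layer[i][cell] = tau
    return layer


n, w, gamma = 6, 2, 1
layer = thread_layer(n, (0, 2, 4))
for row in layer:
    print(row)
```

Every row of the layer receives exactly one symbol from each of the n codewords, so the layer is filled without collisions even when distinct shifts share the same set of columns.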

D. Parameters of the Canonical Code 

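1) Parameters n, α: Since the number of layers changes, the parameter α is different from the case of $(n, w + \gamma) = 1$. It is given by

$$\alpha = LV(w + \gamma) = \frac{(w + \gamma) \cdot \mathrm{lcm}(w, w+1, \cdots, w + \gamma - 1)}{w} \sum_{r : gr \mid n} r E(gr)$$

$$= \frac{(w + \gamma) \cdot \mathrm{lcm}(w, w+1, \cdots, w + \gamma - 1)}{w} \sum_{r : gr \mid n} \frac{1}{g} \sum_{s \mid r} \mu(s) \binom{gr/s}{(w+\gamma)gr/(sn)}.$$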



2) Determining Parameters d and β: Following the exact set of arguments in Sec. II-D2 we can show that $d = n - \gamma$ and β is given by



/3 



\p — I J \w + p— l/W + 'J — p 



As a check, we note that each column contains $\alpha = LV(w + \gamma)$ symbols, each of which requires the transfer of w symbols to enable repair. Since there are a total of $(n - \gamma)$ helper nodes, we must have that

$$\beta(n - \gamma) = w\alpha,$$

i.e., β must equal

$$\beta = \frac{w}{(n - \gamma)} \alpha = \frac{w}{(n - \gamma)} (w + \gamma) LV. \quad (17)$$

It can be verified that the values for β obtained in (16) and (17) are the same.

3) Determining k, K and Code Rate R: Arguing as earlier, the scalar dimension of the code is clearly given by $K_c = LVnw$. Not surprisingly, the rate R of the code is given by $\frac{K_c}{n\alpha} = \frac{w}{w + \gamma}$.
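The parameter formulas of this section can be evaluated numerically. The sketch below is illustrative only: the function name `canonical_parameters` is hypothetical, and it assumes that $g = n / \gcd(n, w + \gamma)$ and that the period of a pattern equals the size of the cyclic-shift orbit of its footprint, so that $\omega_\ell$ is the orbit size divided by g. The orbits are found by brute force rather than through E(gr).

```python
from fractions import Fraction
from itertools import combinations
from math import gcd, lcm


def canonical_parameters(n, w, gamma):
    size = w + gamma
    g = n // gcd(n, size)  # assumed definition of g (see lead-in)
    seen, L = set(), 0
    for subset in combinations(range(n), size):
        if subset in seen:
            continue
        # Orbit of the footprint under cyclic column shifts.
        orbit = {tuple(sorted((x + t) % n for x in subset)) for t in range(n)}
        seen |= orbit
        L += len(orbit) // g  # omega_l = r when the orbit size is g * r
    V = lcm(*range(w, w + gamma)) // w
    alpha = L * V * size
    beta = Fraction(w * alpha, n - gamma)  # from beta (n - gamma) = w alpha
    K_c = L * V * n * w
    return {"L": L, "V": V, "alpha": alpha, "beta": beta,
            "K_c": K_c, "rate": Fraction(K_c, n * alpha)}


p5 = canonical_parameters(5, 2, 1)  # gcd(n, w + gamma) = 1
p6 = canonical_parameters(6, 2, 1)  # gcd(n, w + gamma) = 3
print(p5)
print(p6)
```

In both examples the rate comes out as w/(w + γ) = 2/3, independent of L and V.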

IV. Construction of $(n, k \le d)$-Layered Regenerating Code

In this section, we will describe the construction of the non-canonical layered regenerating code $\mathcal{C}_{\text{lrc}}$ for a general parameter set $((n, k, d), (\alpha, \beta), K)$, again indexed by two auxiliary parameters $w, \gamma$ satisfying

$$2 \le w \le k, \quad 1 \le \gamma \le (n - k).$$
A. Construction of the non-canonical code $\mathcal{C}_{\text{lrc}}$

The non-canonical regenerating code $\mathcal{C}_{\text{lrc}}$ makes use of the canonical code $\mathcal{C}_{\text{can}}$ as shown in Fig. 10. It also makes use of linearized polynomials along the lines of their usage in [12]. Since the construction uses the canonical code, we need to consider the cases $(n, w + \gamma) = 1$ and $(n, w + \gamma) > 1$ separately. We will consider only the case $(n, w + \gamma) = 1$; the general case follows accordingly.

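The K message symbols $\{m_i\}_{i=1}^{K}$ of $\mathcal{C}_{\text{lrc}}$ are first used to construct a linearized polynomial

$$f(x) = \sum_{i=1}^{K} m_i x^{q^{i-1}}.$$

The linearized polynomial is then evaluated at $K_c$ elements $\{\theta_i\}_{i=1}^{K_c}$ of $\mathbb{F}_{q^N}$ which, when viewed as vectors over $\mathbb{F}_q$, are linearly independent. The resulting $K_c$ evaluations $\{f(\theta_i)\}$ are then fed as input to an encoder for the canonical code $\mathcal{C}_{\text{can}}$. We set

$$\underline{u} = (u_1, u_2, \ldots, u_{K_c}), \quad u_i = f(\theta_i).$$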

The non-canonical regenerating code is the output of the canonical code to the input u. 



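[Figure: a polynomial evaluator computes $[f(\theta_1) \ f(\theta_2) \cdots f(\theta_{K_c})]$ at linearly independent points from $\mathbb{F}_{q^N}$; its output $\underline{u}$ is the input to the basic layered regenerating code.]

Fig. 10. Encoder of a Layered Regenerating Code.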
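The property of f that the construction relies on is linearity over the base field. The following sketch checks it in a toy setting; the choices q = 2, N = 4, the modulus $x^4 + x + 1$ for GF(16), the integer encoding of field elements, and the message values are all assumptions made for illustration and are not taken from the paper.

```python
def gf16_mul(a, b):
    """Multiply two elements of GF(2^4), modulus x^4 + x + 1 (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011  # reduce modulo x^4 + x + 1
    return result


def eval_linearized(message, x):
    """Evaluate f(x) = sum_i m_i x^(2^(i-1)) over GF(16), i.e. q = 2."""
    total, power = 0, x
    for m in message:
        total ^= gf16_mul(m, power)
        power = gf16_mul(power, power)  # Frobenius map x -> x^2
    return total


message = [3, 7, 1]  # K = 3 hypothetical message symbols
# Addition in GF(16) is bitwise XOR in this encoding.
is_linear = all(
    eval_linearized(message, a ^ b)
    == eval_linearized(message, a) ^ eval_linearized(message, b)
    for a in range(16) for b in range(16)
)
print(is_linear)
```

Because f is additive, evaluating it on any linear combination of the points $\theta_i$ gives the same combination of the evaluations, which is what allows the canonical encoder to be applied after the evaluation step.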



B. Parameters of $\mathcal{C}_{\text{lrc}}$

Clearly, the parameters n, α are exactly the same as those of the canonical code. We first proceed to relate k and K of $\mathcal{C}_{\text{lrc}}$. Towards that, we begin by presenting a generator-matrix viewpoint of the canonical code.

1) Two generator matrices for the canonical code $\mathcal{C}_{\text{can}}$: Thus far, we have described the code in terms of the structure of the codeword, viewed as a layered array. Towards determining k and K of the code, we now turn to a generator matrix viewpoint of the code. To obtain a generator matrix, one needs to vectorize the code matrix, thus replacing the code matrix by a vector of size $n\alpha = nLV(w + \gamma)$. The generator matrix then describes the linear relation between the $LVnw$ input symbols of the canonical code and the $n\alpha$ output symbols. Let us set $N_c = n\alpha$ and recall that $K_c = LVnw$. Then the generator matrix is of size $(K_c \times N_c)$.

The generator matrix clearly depends upon the manner in which the code matrix is vectorized. We will present two vectorizations and hence two generator matrices:

(a) From the distributed storage network point of view, the natural vectorization is one in which the $N_c$ code symbols are ordered such that the first α symbols correspond to the elements of the first column vector (in top-to-bottom order) of the code matrix, the second α symbols correspond, in order, to the elements of the second column vector, and so on. Thus, under this vectorization, the first α columns of the generator matrix correspond to the first column vector of the code matrix and so on. We will refer to this as the canonical vectorization of the code. In terms of the vector-code terminology introduced earlier, each set of columns of the generator matrix corresponding to a column of the code matrix is referred to as a thick column of the generator matrix. The code symbols associated to the ith thick column of the generator matrix are the code symbols stored in the ith storage node. We will use G to denote the generator matrix of the canonical code under this vectorization.

(b) Next, consider a second vectorization of the canonical code and hence a different generator matrix. The code symbols in the code matrix of the canonical code $\mathcal{C}_{\text{can}}$ can be vectorized in such a manner that the resultant code vector is the serial concatenation of the $Mn$ codewords $\{c_\tau^{(\ell,\nu)}\}$ of the code $\mathcal{C}_{\text{mds}}$, each associated to a distinct message vector $u_\tau^{(\ell,\nu)}$. Let $G_{\text{b-d}}$ denote the associated generator matrix of $\mathcal{C}_{\text{can}}$. Clearly, $G_{\text{b-d}}$ has a block-diagonal structure:

$$G_{\text{b-d}} = \begin{bmatrix} G_{\text{mds}} & & & \\ & G_{\text{mds}} & & \\ & & \ddots & \\ & & & G_{\text{mds}} \end{bmatrix}. \quad (18)$$

Here $G_{\text{mds}}$ denotes the generator matrix of the $[w + \gamma, w, \gamma + 1]$-MDS code. It follows from this that the columns of $G_{\text{b-d}}$ associated to code symbols belonging to distinct MDS codewords are linearly independent. Also, any collection of w columns of $G_{\text{b-d}}$ associated with the same codeword of $\mathcal{C}_{\text{mds}}$ are linearly independent.
It is our intent to use the matrix G for generating the canonical code $\mathcal{C}_{\text{can}}$ and the matrix $G_{\text{b-d}}$ for the analysis of $\mathcal{C}_{\text{can}}$. We note that the two generator matrices G and $G_{\text{b-d}}$ of the code $\mathcal{C}_{\text{can}}$ differ only in the order in which the thin columns appear.

2) Rank Accumulation in the Matrix G: The matrix G has the following uniform rank-accumulation property: if one selects a set S containing s thick columns drawn from amongst the n thick columns comprising G, then the rank of the submatrix $G|_S$ of G is independent of the choice of S. Hence the rank of $G|_S$ may be denoted as $\rho_s$, indicating that it depends only on the value of s.

We now proceed to determine $\rho_s$. The value of $\rho_s$ depends on how the collection of thin columns in S intersects with the blocks of $G_{\text{b-d}}$. For every thick column of $G|_S$, let us focus on a subset of thin columns corresponding to symbols from layers with a fixed value of $\nu$. We will refer to the submatrix of $G|_S$ thus obtained as $G^{(\nu)}|_S$. It is clear that the rank of $G|_S$ is V times the rank of $G^{(\nu)}|_S$. The intersection of $G^{(\nu)}|_S$ with blocks of $G_{\text{b-d}}$ can be sets of varying sizes, ranging from 0 to $w + \gamma$. If the intersection is of size p, the rank accumulated is $\min\{p, w\}$, and thus it follows that

$$\rho_s = V \sum_{p=0}^{\min\{s, w+\gamma\}} \binom{s}{p} \binom{n-s}{w+\gamma-p} \min\{p, w\}. \quad (19)$$

We define the rank-accumulation profile of the matrix G as the collection of integers $\{a_i\}_{i=1}^{n}$ given by

$$a_1 = \rho_1, \quad (20)$$
$$a_i = \rho_i - \rho_{i-1}, \quad 2 \le i \le n. \quad (21)$$

It is straightforward to see that

$$\rho_s = \sum_{i=1}^{s} a_i, \quad 1 \le s \le n.$$

We will then have that

$$a_i = \alpha, \quad 1 \le i \le w,$$
$$a_i = 0, \quad k + 1 \le i \le n.$$
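The rank-accumulation profile can be tabulated directly from the expression for $\rho_s$. The sketch below is a hedged illustration for the case $(n, w + \gamma) = 1$: the helper names `rho` and `profile` are hypothetical, and the value $\alpha = V\binom{n-1}{w+\gamma-1}$ used in the comments follows from $\alpha = LV(w+\gamma)$ under the assumption that $L = \binom{n}{w+\gamma}/n$ in that case.

```python
from math import comb, lcm


def rho(n, w, gamma, s):
    """Rank of any s thick columns of G, assuming gcd(n, w + gamma) = 1."""
    V = lcm(*range(w, w + gamma)) // w
    return V * sum(
        comb(s, p) * comb(n - s, w + gamma - p) * min(p, w)
        for p in range(0, min(s, w + gamma) + 1)
    )


def profile(n, w, gamma):
    """Rank-accumulation profile a_1, ..., a_n (differences of rho)."""
    ranks = [0] + [rho(n, w, gamma, s) for s in range(1, n + 1)]
    return [ranks[i] - ranks[i - 1] for i in range(1, n + 1)]


print(profile(5, 2, 1))  # alpha = 6 here
print(profile(7, 3, 2))  # alpha = 60 here
```

In both examples the first w entries equal α and the entries beyond position n − γ are zero, in line with the two properties stated above.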

3) Parameters k and K: Having described the rank accumulation profile of the canonical code, we are ready to relate k and K of the layered code $\mathcal{C}_{\text{lrc}}$. We begin with a useful lemma.

Lemma 4.1: Let $k_0$ be the smallest number of thick columns of the generator matrix G of the canonical code $\mathcal{C}_{\text{can}}$ such that the submatrix of G obtained by selecting any $k_0$ thick columns of G has rank $\ge K$. Then by connecting to any $k_0$ nodes associated to the regenerating code $\mathcal{C}_{\text{lrc}}$, a data collector will be able to recover the message symbols $\{m_i\}_{i=1}^{K}$.

Proof: Let S be a collection of $k_0$ thick columns of the matrix G such that

$$\mathrm{Rank}(G|_S) \ge K.$$

The code symbols $(c_1, c_2, \cdots, c_n)$ of the layered regenerating code $\mathcal{C}_{\text{lrc}}$ are related to G as shown below:

$$(c_1, c_2, \cdots, c_n) = [f(\theta_1) \ f(\theta_2) \cdots f(\theta_{K_c})][G].$$

Using linearity of $f(\cdot)$, we can write this as

$$(c_1, c_2, \cdots, c_n) = f([\underline{\theta}_1 \ \underline{\theta}_2 \cdots \underline{\theta}_{K_c}][G]),$$

in which $\underline{\theta}_i$ is the vector representation of the element $\theta_i \in \mathbb{F}_{q^N}$. Set

$$\Theta = [\underline{\theta}_1 \cdots \underline{\theta}_{K_c}].$$

Since the $\{\underline{\theta}_i\}_{i=1}^{K_c}$ are linearly independent over $\mathbb{F}_q$, it follows that

$$\mathrm{Rank}(\Theta \cdot G|_S) = \mathrm{Rank}(G|_S) \ge K.$$

Hence there are at least K linearly independent columns in the matrix product $\Theta \cdot G|_S$ and thus the computation $f(\Theta \cdot G|_S)$ yields evaluations of $f(\cdot)$ at no fewer than K linearly independent points of $\mathbb{F}_{q^N}$. Since $f(\cdot)$ is of q-degree $(K - 1)$, the coefficients of f can be recovered from these K evaluations. ■

It follows from the discussion above that, in order to relate the parameters K, k of $\mathcal{C}_{\text{lrc}}$, it suffices to study the canonical code $\mathcal{C}_{\text{can}}$ and determine the smallest number $k_0$ of columns of its generator matrix G such that the corresponding submatrix has rank at least K. But from the uniform rank accumulation property of the generator matrix G of the canonical code $\mathcal{C}_{\text{can}}$, this is simply given by

$$k_0 = \min\{k \mid \rho_k \ge K\}. \quad (22)$$

Equivalently, the scalar dimension (or the filesize) K of the layered regenerating code $\mathcal{C}_{\text{lrc}}$ for a given value of k is given by

$$K = V \sum_{p=0}^{\min\{k, w+\gamma\}} \min\{w, p\} \binom{k}{p} \binom{n-k}{w+\gamma-p}. \quad (23)$$
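The relation between K and k in (22) and (23) amounts to a lookup in the sequence $\rho_1, \rho_2, \ldots, \rho_n$. The following sketch illustrates this for $(n, w + \gamma) = 1$; the function names `rho` and `min_nodes` are hypothetical, and the example file sizes are chosen only for illustration.

```python
from math import comb, lcm


def rho(n, w, gamma, s):
    """Rank of any s thick columns of G, assuming gcd(n, w + gamma) = 1."""
    V = lcm(*range(w, w + gamma)) // w
    return V * sum(
        comb(s, p) * comb(n - s, w + gamma - p) * min(p, w)
        for p in range(0, min(s, w + gamma) + 1)
    )


def min_nodes(n, w, gamma, K):
    """Smallest k with rho_k >= K, as in (22); None if K is too large."""
    for k in range(1, n + 1):
        if rho(n, w, gamma, k) >= K:
            return k
    return None


# For (n, w, gamma) = (5, 2, 1): rho_1..rho_5 = 6, 12, 17, 20, 20.
print(min_nodes(5, 2, 1, 15))  # a file of 15 symbols
print(rho(5, 2, 1, 3))         # largest file size supported with k = 3
```

Choosing K equal to $\rho_k$ gives the largest file size for which any k nodes suffice.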

4) Parameters d, β: From the discussion on the rank accumulation profile, it follows that the scalar dimension K will be strictly greater than $w\alpha$ when $w < k$, and hence we will have

$$w\alpha < K \le k\alpha.$$

Thus, it is meaningful to have a scheme that repairs a failed node by downloading $w\alpha$ symbols. Hence, we follow the same repair strategy as in the case of the canonical code, setting $d = n - \gamma$. We can repair any failed node by downloading a fixed number β of symbols from every helper node. The value of β thus obtained would be

$$\beta = V \sum_{p=1}^{\gamma} \binom{\gamma - 1}{p - 1} \binom{n - \gamma - 1}{w + \gamma - p - 1} \frac{w}{w + \gamma - p}. \quad (24)$$

It can also be checked that $(n - \gamma)\beta = w\alpha$.
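The identity $(n - \gamma)\beta = w\alpha$ can be checked numerically. The sketch below assumes $(n, w + \gamma) = 1$, takes $\alpha = V\binom{n-1}{w+\gamma-1}$ (which follows from $\alpha = LV(w+\gamma)$ if $L = \binom{n}{w+\gamma}/n$), and evaluates β as a sum over the number p − 1 of non-helper nodes that share a codeword with the failed node. That form of the sum is this sketch's reading of (24) and should be treated as an assumption; the function name `alpha_beta` is hypothetical.

```python
from fractions import Fraction
from math import comb, lcm


def alpha_beta(n, w, gamma):
    V = lcm(*range(w, w + gamma)) // w
    alpha = V * comb(n - 1, w + gamma - 1)
    # Each of the (w + gamma - p) helpers in a codeword contributes a
    # fraction w / (w + gamma - p) of a symbol per repetition.
    beta = V * sum(
        Fraction(w, w + gamma - p)
        * comb(gamma - 1, p - 1)
        * comb(n - gamma - 1, w + gamma - p - 1)
        for p in range(1, gamma + 1)
    )
    return alpha, beta


for n, w, gamma in [(5, 2, 1), (7, 3, 2), (11, 4, 3)]:
    alpha, beta = alpha_beta(n, w, gamma)
    print(n, w, gamma, alpha, beta, (n - gamma) * beta == w * alpha)
```

The factor V is what makes each fractional contribution w/(w + γ − p) integral, since V·w is a common multiple of w, w + 1, ..., w + γ − 1.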
C. Some Remarks on the parameters of $\mathcal{C}_{\text{lrc}}$

The following remarks on the parameters of $\mathcal{C}_{\text{lrc}}$ are worth mentioning.

Remark 1: From the description in Sec. I-A, it is clear that every regenerating code must satisfy

$$(d - k + 1)\beta \le \alpha \le d\beta. \quad (25)$$

Since layered regenerating codes have

$$d\beta = w\alpha,$$

we must have

$$k \ge d\left(1 - \frac{1}{w}\right) + 1.$$

A lower bound on k imposes only a lower limit on the rate, and hence the above constraint does not come with any penalty.

Remark 2: In the construction, we have assumed the auxiliary parameter w to be greater than 1 because w turns out to be the dimension of the common erasure code. Nevertheless, we can consider the extreme case of $w = 1$, where the erasure code becomes a trivial repetition code. In addition, let us set $\gamma = 1$, and hence $w + \gamma = 2$. Then for all odd-valued n, $(n, w + \gamma) = 1$ and hence in that case, we have

$$V = 1,$$
$$\alpha = n - 1,$$
$$d\beta = \alpha.$$

The code thus obtained is structurally similar to the repair-by-transfer MBR codes and differs only in that the underlying MDS code present in the construction of the repair-by-transfer MBR codes in [4] is replaced here by an MDS code that is constructed using linearized polynomials.

Remark 3: If the linearized polynomial of q-degree $(K - 1)$ used in the construction of $\mathcal{C}_{\text{lrc}}$ is replaced by an (ordinary) polynomial of degree $(K - 1)$, then one can still go on to obtain a regenerating code. While this code will have a smaller field size, it will, however, have a lower rate in comparison to the code $\mathcal{C}_{\text{lrc}}$ constructed here.



V. On the Optimality of the canonical code 

In this section, we state two results pertaining to the performance of the canonical code against the storage-repair-bandwidth tradeoff. The first result shows that for any $(n, k, d = k)$ parameter set, we can construct canonical codes that perform better than the space-sharing code. In the second, we establish the achievability of an interior point in the storage-repair-bandwidth tradeoff by an exact-repair code when $d = k = n - 1$. The interior point we achieve is on the line-segment joining the MSR point and the next point of slope-discontinuity, where the non-achievability results established in [4] do not apply. Both these results follow immediately from simple calculations.

Lemma 5.1: The $(n, k, d = k)$-canonical code operates at an $(\alpha, d\beta)$-point that lies between the MSR and MBR points, and performs better than the code that space-shares the MSR and MBR points, whenever $\gamma < w < k$.
Proof: For any regenerating code with $d = k$, we must have

$$\frac{\alpha}{d} \le \beta \le \alpha.$$

Since $w + \gamma \le n$, we must have $w \le n - \gamma = d = k$. Furthermore, $\beta = \frac{w\alpha}{d}$ for the canonical code. Thus it follows that the code operates at a point between the MSR and MBR points.
From [4], we can express the space-sharing line in the form

$$d\beta = \frac{d(2K - k\alpha)}{k(d - k + 1)}.$$

When $d = k$, it reduces to

$$d\beta = 2K - k\alpha. \quad (26)$$

For the canonical code, we have

$$K = \left(\frac{w}{w + \gamma}\right) n\alpha,$$

and hence, for it to perform better than the space-sharing code, we must have

$$w\alpha < 2\left(\frac{w}{w + \gamma}\right) n\alpha - k\alpha.$$

It can be verified that the above condition holds whenever $(k - w)(w - \gamma) > 0$, which is true when $\gamma < w < k$. ■
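The final step of the proof can be confirmed by exhaustive search over small parameters. This sketch is only a sanity check: the function name `beats_space_sharing` is hypothetical, α is normalised to 1 (the condition is homogeneous in α), and k is set to n − γ as in the lemma.

```python
from fractions import Fraction


def beats_space_sharing(n, w, gamma):
    """Check w*alpha < 2K - k*alpha with alpha = 1 and d = k = n - gamma."""
    k = n - gamma
    K = Fraction(w * n, w + gamma)  # K = (w / (w + gamma)) * n * alpha
    return w < 2 * K - k


mismatches = [
    (n, w, gamma)
    for n in range(3, 13)
    for gamma in range(1, n - 1)
    for w in range(1, n - gamma + 1)
    if beats_space_sharing(n, w, gamma) != (gamma < w < n - gamma)
]
print(mismatches)
```

An empty list indicates that, over the searched range, the inequality holds exactly when γ < w < k.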
Corollary 5.2: When $n < 2k - 1$, there exist exact-repair $(n, k, d = k)$-regenerating codes that operate between the MSR and the MBR points, performing better than the space-sharing line.

Proof: An integer value of w satisfying $\gamma < w < k$ can be found when $n < 2k - 1$. The statement follows from that. ■
Lemma 5.3: The $(n, n - 1, n - 1)$-canonical code achieves an interior point of the storage-repair-bandwidth tradeoff that lies between the MSR point and the next point of slope-discontinuity, specified by

$$\alpha = (d - (k - 2))\beta - \left[\frac{k - 2}{k - 1}\right]\beta. \quad (27)$$



Proof: The results in ^ imply that the rank accumulation profile of a linear optimal regenerating code must 
satisfy, 

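$$a_p^* = \begin{cases} \min\{\alpha, (d - p + 1)\beta\}, & 1 \le p \le k \\ 0, & k < p \le n \end{cases}$$

$$\phantom{a_p^*} = \begin{cases} \alpha, & 1 \le p \le \left\lfloor d - \frac{\alpha}{\beta} \right\rfloor + 1 \\ (d - p + 1)\beta, & \left\lfloor d - \frac{\alpha}{\beta} \right\rfloor + 1 < p \le k \\ 0, & k < p \le n. \end{cases}$$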

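Thus a linear code is an optimal regenerating code if and only if it satisfies the above rank accumulation profile. For a regenerating code with $d\beta = w\alpha$, we calculate

$$a_p^* = \begin{cases} \alpha, & 1 \le p \le \left\lfloor d\left(\frac{w-1}{w}\right) \right\rfloor + 1 \\ \frac{(d - p + 1)}{d} w\alpha, & \left\lfloor d\left(\frac{w-1}{w}\right) \right\rfloor + 1 < p \le k \\ 0, & k < p \le n. \end{cases} \quad (28)$$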



Now consider the $(n, n - 1, n - 1)$-canonical code. This means $\gamma = 1$, and it then follows from (21) that the rank accumulation profile of the code is

$$a_p = \begin{cases} \alpha, & 1 \le p \le w \\ \alpha - \binom{p-1}{w}, & w + 1 \le p \le k \\ 0, & k < p \le n. \end{cases} \quad (29)$$



For (28) and (29) to match, it is also necessary to check that

$$\left\lfloor d\left(\frac{w-1}{w}\right) \right\rfloor + 1 = w \iff d \in \{w, w + 1\}.$$

If we choose $w = d - 1$, we obtain

$$a_p^* = a_p = \alpha, \quad 1 \le p \le w = k - 1.$$

Furthermore, we must check that $a_k = a_k^*$:

$$a_k^* = \frac{w\alpha}{d} = \frac{w\alpha}{w + 1}, \quad (30)$$

$$a_k = \alpha - \binom{k-1}{k-1} = \alpha - 1. \quad (31)$$

For the canonical code, since $(n, w + 1) = (w + 2, w + 1) = 1$,

$$\alpha = w + 1.$$

Thus it follows that

$$a_k = a_k^* = w,$$

showing that the $(n, n - 1, n - 1)$-canonical code with $w = d - 1$ is an optimal regenerating code. Since the accumulation profile has values $a_p = \alpha$, $1 \le p \le (k - 1)$, and $a_k = \alpha - 1$, it achieves a point between the MSR point and the next point of slope-discontinuity on the tradeoff. It can be calculated that the interior point thus achieved is specified by (27). ■

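The matching of the two profiles in the proof of Lemma 5.3 can be checked for small w. The sketch below is an illustration under the proof's own choices ($\gamma = 1$, $n = w + 2$, $d = k = w + 1$, $\alpha = w + 1$, $\beta = w\alpha/d = w$); the function name `profiles` is hypothetical.

```python
from math import comb


def profiles(w):
    """Return (optimal, canonical) rank-accumulation profiles."""
    n, d, k = w + 2, w + 1, w + 1
    alpha = w + 1
    beta = w * alpha // d  # equals w, since d = w + 1
    optimal = [
        min(alpha, (d - p + 1) * beta) if p <= k else 0
        for p in range(1, n + 1)
    ]
    # With gamma = 1 the canonical profile is alpha - C(p - 1, w).
    canonical = [alpha - comb(p - 1, w) for p in range(1, n + 1)]
    return optimal, canonical


for w in range(2, 10):
    optimal, canonical = profiles(w)
    print(w, optimal == canonical)
print(profiles(2))
```

For w = 2 both profiles are (3, 3, 2, 0): the first k − 1 entries equal α and the k-th entry equals α − 1, as stated in the proof.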


VI. Codes with Canonical-Code-Locality 

In this section, we will briefly describe how it is possible to construct codes with locality in which each of the local codes is the canonical (layered regenerating) code $\mathcal{C}_{\text{can}}$. The same technique can also be used to generate codes with locality in which the local codes are the layered regenerating codes $\mathcal{C}_{\text{lrc}}$.

A. Locality in Vector Codes 

Let $\mathcal{C}$ be an $[n, K, d_{\min}, \alpha]$ vector code over a field $\mathbb{F}_q$, possessing a $(K \times n\alpha)$ generator matrix G. The ith code symbol, $c_i$, is said to have (exact) $(r, \delta)$ locality, $\delta \ge 2$, if it is possible to puncture the code in coordinates corresponding to a set of indices S with $i \in S$, such that the punctured code $\mathcal{C}|_S$ has length $r + \delta - 1$ and minimum distance $\delta$. The code $\mathcal{C}$ is said to have $(r, \delta)$ all-symbol locality if all code symbols have $(r, \delta)$ locality. The codes obtained through puncturing will be called local codes. Our interest here is in the construction of a code with exact, all-symbol locality, whose local codes correspond to the canonical code $\mathcal{C}_{\text{can}}$ introduced in Section II.

The property of locality makes it possible to minimise the number of node accesses during node repair. The concept of locality was introduced in [7] for scalar codes for single erasures. Subsequently it was generalised to multiple
erasures and later to vector codes; see HI, |[T3l . ||5l and |[T2l . lITTI . Codes combining benefits of regenerating codes 
and codes with locality are constructed in lITll . jSl and ll22ll . 

B. Code Construction 

Let $t \ge 2$ and let $\{\phi_i\}_{i=1}^{tK_c}$ be a collection of elements in $\mathbb{F}_{q^N}$ and let $\{\underline{\phi}_i\}$ denote the representation of the $\{\phi_i\}$ as elements of $\mathbb{F}_q^N$. Given a message vector $[m_1, m_2, \ldots, m_K]^T$, we construct the linearized polynomial

$$h(x) = \sum_{i=1}^{K} m_i x^{q^{i-1}},$$

and form the $tK_c$-tuple $[h(\phi_1), h(\phi_2), \cdots, h(\phi_{tK_c})]$. This evaluation vector is then partitioned into t evaluation vectors, each containing $K_c$ components, which are then fed to t respective encoders for the canonical code. The corresponding outputs of these encoders are then concatenated to form the desired codeword. It can be shown that the resultant code is optimal in terms of having the best possible minimum distance for the given scalar dimension.

Appendix

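Lemma A.1: For r such that $gr \mid n$, the number of equivalence classes of size $gr$ is given by

$$E(gr) = \frac{1}{gr} \sum_{s \mid r} \mu(s) \binom{gr/s}{(w+\gamma)gr/(sn)}.$$

In particular, the total number of equivalence classes $\mathcal{L}$ is given by

$$\mathcal{L} = \sum_{r : gr \mid n} E(gr).$$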

Proof: For r such that $gr \mid n$, let $h_1(r)$ denote the number of patterns whose period divides $gr$. Then $h_1(r)$ is given by

$$h_1(r) = \binom{gr}{\frac{(w+\gamma)gr}{n}}.$$

Let $h_2(r)$ denote the number of patterns having period equal to $gr$. Then we have

$$h_1(r) = \sum_{s \mid r} h_2(s),$$

and by Möbius inversion, we obtain

$$h_2(r) = \sum_{s \mid r} h_1\left(\frac{r}{s}\right) \mu(s), \quad (32)$$

where μ is the Möbius function. Thus the number of equivalence classes of size $gr$ is given by

$$E(gr) = \frac{1}{gr} h_2(r) = \frac{1}{gr} \sum_{s \mid r} h_1\left(\frac{r}{s}\right) \mu(s) = \frac{1}{gr} \sum_{s \mid r} \mu(s) \binom{gr/s}{(w+\gamma)gr/(sn)}.$$

The result for the total number of equivalence classes follows immediately. 
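The count in Lemma A.1 can be compared against a brute-force enumeration of cyclic-shift orbits. The sketch below is illustrative: it assumes $g = n / \gcd(n, w + \gamma)$ and that an equivalence class is the orbit of a $(w + \gamma)$-subset of $\{0, \ldots, n-1\}$ under cyclic shifts; the function names are hypothetical and `size` stands for $w + \gamma$.

```python
from itertools import combinations
from math import comb, gcd


def mobius(m):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0  # squared prime factor
            result = -result
        p += 1
    return -result if m > 1 else result


def classes_formula(n, size, r):
    """E(gr) from the Moebius-inversion expression of Lemma A.1."""
    g = n // gcd(n, size)  # assumed definition of g
    total = sum(
        mobius(s) * comb(g * r // s, size * g * r // (s * n))
        for s in range(1, r + 1) if r % s == 0
    )
    return total // (g * r)


def classes_bruteforce(n, size):
    """Map orbit size -> number of orbits of size-subsets of Z_n."""
    seen, counts = set(), {}
    for subset in combinations(range(n), size):
        if subset in seen:
            continue
        orbit = {tuple(sorted((x + t) % n for x in subset)) for t in range(n)}
        seen |= orbit
        counts[len(orbit)] = counts.get(len(orbit), 0) + 1
    return counts


print(classes_bruteforce(12, 4))
print({3 * r: classes_formula(12, 4, r) for r in (1, 2, 4)})
```

For n = 12 and w + γ = 4 (so g = 3) the two computations agree on 1, 2 and 40 classes of sizes 3, 6 and 12 respectively.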